Model selection for the rate problem: A comparison of significance testing, Bayesian, and minimum description length statistical inference

Authors

  • Michael D. Lee
  • Kenneth J. Pope
Abstract

One particularly useful but under-explored area for applying model selection in psychology is in basic data analysis. Many problems of deciding whether data have "significant differences" can profitably be viewed as model selection problems. We consider significance testing, Bayesian and minimum description length (MDL) model selection on a common data analysis problem known as the rate problem. In the rate problem, the question is whether or not the underlying rate of some phenomenon is the same in two populations, based on finite samples from each population that count the number of "successes" from the total number of observations. We develop optimal Bayesian and MDL statistical criteria for making this decision, and compare their performance to the standard significance testing approach. A series of Monte-Carlo evaluations, using different realistic assumptions about the availability of data in rate problems, show that the Bayesian and MDL criteria perform extremely similarly, and perform at least as well as the significance testing approach. © 2005 Elsevier Inc. All rights reserved.
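As a rough illustration of how the rate problem can be framed as model selection, the sketch below computes a log Bayes factor comparing a "same rate" model against a "different rates" model for two binomial samples. It assumes uniform Beta(1, 1) priors on every rate, which is an assumption made here for illustration and not necessarily the criterion derived in the paper. The function names `log_beta` and `log_bayes_factor_same_vs_different` are hypothetical.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_bayes_factor_same_vs_different(k1, n1, k2, n2):
    """Log Bayes factor for 'same rate' over 'different rates'.

    k1, k2 are success counts out of n1, n2 observations.
    Assumes uniform Beta(1, 1) priors on all rates (illustrative assumption).
    The binomial coefficients are common to both models and cancel.
    """
    # Same-rate model: one shared rate, integrated out over its uniform prior.
    log_same = log_beta(k1 + k2 + 1, (n1 - k1) + (n2 - k2) + 1)
    # Different-rates model: two independent rates, each integrated out.
    log_diff = log_beta(k1 + 1, n1 - k1 + 1) + log_beta(k2 + 1, n2 - k2 + 1)
    return log_same - log_diff


if __name__ == "__main__":
    # Identical sample proportions: the simpler same-rate model is favored.
    print(log_bayes_factor_same_vs_different(5, 10, 5, 10))
    # Very different sample proportions: the different-rates model is favored.
    print(log_bayes_factor_same_vs_different(1, 20, 19, 20))
```

A positive value favors the same-rate model and a negative value favors different rates. Because the simpler model is rewarded when the data do not demand two separate rates, the comparison penalizes unnecessary complexity without a separate correction term.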


Similar resources

Investigation on Several Model Selection Criteria for Determining the Number of Cluster

Determining the number of clusters is a crucial problem in clustering. Conventionally, the number of clusters has been selected via cost-function-based criteria such as Akaike's information criterion (AIC), the consistent Akaike's information criterion (CAIC), and the minimum description length (MDL) criterion, which formally coincides with the Bayesian inference criterion (BIC). In...


A Disease Outbreak Prediction Model Using Bayesian Inference: A Case of Influenza

Introduction: One major problem in analyzing epidemic data is the lack of data and high dependency among the available data, which is due to the fact that the epidemic process is not directly observable. Methods: One method for epidemic data analysis to estimate the desired epidemic parameters, such as disease transmission rate and recovery rate, is data ...


Bayesian approach to inference of population structure

Methods of inferring population structure, and their applications in identifying disease models and in forecasting the physical and mental condition of human beings, have become increasingly important. In this article, the motivation and significance of studying the problem of population structure are first explained. In the next section, the applications of inference of p...


A Geometric Formulation of Occam’s Razor For Inference of Parametric Distributions

I define a natural measure of the complexity of a parametric distribution relative to a given true distribution called the razor of a model family. The Minimum Description Length principle (MDL) and Bayesian inference are shown to give empirical approximations of the razor via an analysis that significantly extends existing results on the asymptotics of Bayesian model selection. I treat paramet...




Publication date: 2006